By the end of this tutorial, you will go through:
The Space Telescope Imaging Spectrograph (STIS) is a versatile imaging spectrograph installed on the Hubble Space Telescope (HST), covering a wide range of wavelengths from the near-infrared region into the ultraviolet.
This tutorial aims to prepare new users to begin analyzing STIS data by walking through downloading data, reading and viewing a spectrum, and viewing a STIS image.
There are three detectors on STIS: the FUV-MAMA, the NUV-MAMA, and the CCD. While the detectors are designed for different scientific purposes and operate at different wavelengths, their data are organized in the same structure. We therefore use FUV-MAMA data as the example in this notebook, and leave the other detectors as exercises for the reader.
For a detailed overview of the STIS instrument, see the STIS Instrument Handbook. For more information on STIS data analysis and operations, see the STIS Data Handbook.
Defining some terms:
We will import the following packages:
- astropy.io fits for accessing FITS files
- astropy.table Table for creating tidy tables of the data
- astroquery.mast Observations for finding and downloading data from the MAST archive
- pathlib for managing system paths
- matplotlib.pyplot for plotting data
- IPython.display for formatting display
- numpy to handle array functions
- pandas to make basic tables and dataframes
- stistools for quick operations on STIS Data
# Import for: Reading in fits file
from astropy.table import Table
from astropy.io import fits
# Import for: Downloading necessary files. (Not necessary if you choose to collect data from MAST)
from astroquery.mast import Observations
# Import for: Managing system variables and paths
from pathlib import Path
# Import for: Plotting and specifying plotting parameters
import matplotlib.pyplot as plt
import matplotlib
import matplotlib.cm as cm
from IPython.display import display
# Import for: Quick Calculation and Data Analysis
import numpy as np
import pandas as pd
import math
# Import for operations on STIS Data
import stistools
astroquery
There are other ways to download data from MAST, such as using CyberDuck; we only show how to use astroquery in this notebook.
# make directory for downloading data
datadir = Path('./data')
datadir.mkdir(exist_ok=True)
# Search the target object by obs_id
target = Observations.query_criteria(obs_id='ODJ102010')
# get a list of files associated with that target
FUV_list = Observations.get_product_list(target)
# Download only the SCIENCE fits files
Observations.download_products(FUV_list, productType="SCIENCE", extension='fits', download_dir=str(datadir))
Before doing any operation on the data, we want to first explore the basics and data file structures.
The aperture-extracted, background-subtracted, flux- and wavelength-calibrated spectral data are stored in FITS files with the suffix _x1d. While we use an _x1d FITS file as the example for investigating STIS table data, the following methods of reading in data and viewing the spectrum or other fields can be applied to any table data, calibrated or uncalibrated. For more information on STIS file naming conventions, see Types of STIS Files.
Open the x1d fits file and explore its info and header:
#get information about the fits file
x1d_file=Path("./data/mastDownload/HST/odj102010/odj102010_x1d.fits")
fits.info(x1d_file)
The primary header stores keyword information describing the global properties of all of the exposures in the file:
#get header of the fits file
x1d_header_0 = fits.getheader(x1d_file,0)
try:
    print(x1d_header_0["instrument"])
except KeyError:
    print(x1d_header_0["primesi"])
for key in ["detector", "obsmode", "opt_elem", "targname"]:
    print("{0}: {1}".format(key, x1d_header_0[key]))
print()
You can change the keys to check for other fields and metadata, or directly print the x1d_header to get all header information.
Some other metadata, such as the exposure date and time, are stored in the first extension.
x1d_header_1 = fits.getheader(x1d_file, 1)
date = x1d_header_1["DATE-OBS"]
time = x1d_header_1["TIME-OBS"]
exptime = x1d_header_1["EXPTIME"]
print("The data were taken on {d}, starting at {t}, with an exposure time of {e} seconds".format(d=date, t=time, e=exptime))
The main science data is stored in the first extension of the _x1d FITS file. We first read in the data as an astropy table.
#get data
x1d_data = Table.read(x1d_file,1)
# Display a representation of the data table:
x1d_data
# We can also get the columns of this table:
columns = x1d_data.colnames
columns
Another popular way of reading in FITS data is to take it from the HDU list as a "FITS_rec":
with fits.open(x1d_file) as hdulist:
    fuv_x1d_data = hdulist[1].data
The actual data of each column are stored in arrays of equal length. We collect the spectrum data and plot it, together with the error bars.
# From the astropy table, we first get all the data we need: wavelength, flux, and error
# notice that for astropy table, the column name is case sensitive
wl,flux,err=x1d_data[0]["WAVELENGTH","FLUX","ERROR"]
# make a plot of the data, use this cell to specify the format of the plot.
matplotlib.rcParams['figure.figsize'] = (20,7)
plt.style.use("bmh")
plt.plot(wl, flux,  # the x-data and y-data
         marker=".", markersize="2", markerfacecolor='w', markeredgewidth=0)  # specifies the data point style
plt.title("STIS FUV Spectrum")
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
You can zoom in to a specific wavelength range using a mask:
# create a mask on wavelength between 1540 Angstrom and 1560 Angstrom
zoom_mask = (wl > 1540) & (wl < 1560)
# apply the created mask to the wavelength, flux, and error
wl_region, flux_region, err_region = wl[zoom_mask],flux[zoom_mask],err[zoom_mask]
# plot the spectrum in that region
plt.plot(wl_region, flux_region,  # the x-data and y-data
         marker=".", markersize="2", markerfacecolor='w', markeredgewidth=0,  # specifies the data point style
         color="blue")  # specifies the line format
plt.title("STIS FUV Spectrum")
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
You can also plot the error bars together with the spectrum. For more errorbar styling options, see matplotlib.pyplot.errorbar:
plt.errorbar(wl_region, flux_region, err_region,#the x-data, y-data, and y-axis error
marker=".",markersize="2",markerfacecolor='w', markeredgewidth=0,color="blue", #specifies the data points style
ecolor="dodgerblue",capsize=0.1) #specifies the format of lines and error bar
plt.title("STIS FUV Spectrum")
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
For more information on formatting the plots using matplotlib, see matplotlib.pyplot.plot, matplotlib.pyplot.errorbar.
Data quality flags are assigned to each pixel in the data quality extension. Each flag has a true (set) or false (unset) state. Flagged conditions are set as specific bits in a 16-bit integer word. For a single pixel, this allows for up to 15 data quality conditions to be flagged simultaneously, using the bitwise logical OR operation. Note that the data quality flags cannot be interpreted simply as integers but must be converted to base-2 and interpreted as flags. These flags are set and used during the course of calibration, and may likewise be interpreted and used by downstream analysis applications.
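The bitwise decomposition described above can be sketched in a few lines. `decode_dq` is a hypothetical helper (not part of stistools or the calibration pipeline), and the value 1284 is an arbitrary example:

```python
def decode_dq(dq_value, n_bits=16):
    """Return the bit positions set in a data quality word."""
    return [b for b in range(n_bits) if dq_value & (1 << b)]


# 1284 = 2**2 + 2**8 + 2**10, so three conditions are flagged simultaneously
print(decode_dq(1284))  # [2, 8, 10]
```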
Make a histogram of the data quality flags, and label the bins by what each data quality value actually means. More info: https://hst-docs.stsci.edu/stisdhb/chapter-2-stis-data-structure/2-5-error-and-data-quality-array
# First get the data quality flags from the x1d fits file
# The DQ column is a masked array that "hides" the pixels with no data quality issue;
# we fill those "good" points with 0, then bin them at -1 in the histogram below
x1d_dq = x1d_data[0]["DQ"].filled(0)
dq_bits = []
for dq in x1d_dq:
    if dq == 0:
        dq_bits.append(-1)
    else:
        for b in range(0, 15):
            if dq & (2**b):
                dq_bits.append(b)
#Assign the meaning of each data quality and make the histogram
meanings = ["No Anomalies", "Error in the Reed Solomon decoding", "Lost data replaced by fill values",
"Bad detector pixel (e.g., bad column or row, mixed science and bias for overscan, or beyond aperture)",
"Data masked by occulting bar","Pixel having dark rate > 5 σ times the median dark level",
"Large blemish, depth > 40% of the normalized p-flat (repeller wire)","Vignetted pixel",
"Pixel in the overscan region",
"Saturated pixel, count rate at 90% of max possible—local non-linearity turns over and is multi-valued; \
pixels within 10% of turnover and all pixels within 4 pixels of that pixel are flagged.",
"Bad pixel in reference file","""Small blemish, depth between 40% and 70% of the normalized flat. \
Applies to MAMA and CCD p-flats""",
">30% of background pixels rejected by sigma-clip, or flagged, during 1-D spectral extraction",
"Extracted flux affected by bad input data",
"Data rejected in input pixel during image combination for cosmic ray rejection",
"Extracted flux not CTI corrected because gross counts are ≤ 0"]
flags = [0] + [2**i for i in range(15)]
bins_ = np.arange(-1, 16)
plt.title("Data quality flag frequencies")
h = plt.hist(dq_bits, bins=bins_)
x = plt.xticks(bins_[:-1]+0.5, labels=flags, rotation='vertical', fontsize=15)
plt.show()
bits = [format(i, '015b') for i in flags]
dq_table = pd.DataFrame({"FLAG Value":flags,"Bit Setting":bits,"Quality Condition Indicated":meanings})
pd.set_option('display.max_colwidth', None)
display(dq_table)
Through the calibration pipeline, some data quality conditions are marked "serious". The combined value of the serious data quality flags is recorded in the "SDQFLAGS" keyword. We can decompose that value bit by bit to see which specific data quality flags are marked serious.
sdqFlags_fuv = fits.getheader(x1d_file, 1)["SDQFLAGS"]
print("The SDQFLAGS is {flag}, which is in binary {bin_flag}.".format(flag=sdqFlags_fuv, bin_flag=bin(sdqFlags_fuv)[2:]))
print("Therefore the following data qualities are marked \"serious\":")
for i in range(15):
    if 2**i & sdqFlags_fuv:
        print("\t{}: {}".format(2**i, meanings[i+1]))
We can now remove the data points with SDQ flags. In the plots below, the first is the same spectrum with the SDQ-flagged data left in. The second is the spectrum with the SDQ-flagged data removed; the SDQ-flagged data points are marked with red '+'.
matplotlib.rcParams['figure.figsize'] = (20,10)
plt.style.use("bmh")
# First zoom in to a region containing SDQ-flagged data
wl, flux, err, dq = x1d_data[0]["WAVELENGTH", "FLUX", "ERROR", "DQ"]
zoom_mask = (wl > min(wl)) & (wl < 1520)
wl_zoom, flux_zoom, err_zoom, dq_zoom = wl[zoom_mask], flux[zoom_mask], err[zoom_mask], dq[zoom_mask]
# Filter the data points to those with no serious DQ flags
mask_noSDQ = np.ones(len(dq_zoom), dtype=bool)
for i in range(0, len(dq_zoom)):
    if dq_zoom[i] & sdqFlags_fuv:
        mask_noSDQ[i] = False
# get the spectrum without SDQ using the mask we just created
wvln_noSDQ, flux_noSDQ, err_noSDQ = wl_zoom[mask_noSDQ], flux_zoom[mask_noSDQ], err_zoom[mask_noSDQ]
# invert the _noSDQ mask to collect the data points with SDQ flags
mask_SDQ = ~mask_noSDQ
wvln_SDQ, flux_SDQ = wl_zoom[mask_SDQ], flux_zoom[mask_SDQ]
# plot1: the spectrum with SDQ flagged data included
plt.subplot(2,1,1)
plt.plot(wl_zoom, flux_zoom,  # the x-data and y-data
         marker=".", markersize="2", markerfacecolor='w', markeredgewidth=0)  # specifies the data point style
plt.title("FUV with SDQ flagged data")
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
# plot2: plot the spectrum without SDQ flagged data, then mark the SDQ data points with +
plt.subplot(2,1,2)
# Plot the filtered datapoints
plt.plot(wvln_noSDQ, flux_noSDQ,  # the x-data and y-data
         marker=".", markersize="2", markerfacecolor='w', markeredgewidth=0, label="SDQ removed spectrum")
plt.plot(wvln_SDQ, flux_SDQ,'r+',label="SDQ flagged data")
plt.legend()
# Format the figure
plt.title("FUV without SDQ flagged data")
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
plt.tight_layout()
The STIS images are stored as two-dimensional arrays in FITS image extension files. For more information on STIS image files and extensions, see STIS FITS Image Extension Files.
The rectified, wavelength- and flux-calibrated first-order spectra, or geometrically corrected imaging data, are stored in FITS files with the _x2d suffix. Similar to what we did with the _x1d file, we first open the FITS file to explore its structure.
#read in the x2d file and get its info
x2d_file=Path("./data/mastDownload/HST/odj102010/odj102010_x2d.fits")
fits.info(x2d_file)
Similarly, we can also get the header from this fits file to see the scientific metadata.
#get header of the fits file
x2d_header = fits.getheader(x2d_file,0)
# get data as a numpy array
with fits.open(x2d_file) as hdu_list:
    x2d_data = hdu_list[1].data
Make a histogram of the magnitude of the image data so that we have a general idea of its distribution. Knowing the distribution is essential for normalizing the data when showing the image.
# remove infinities and nans from the data when making the histogram
log_data = np.log10(x2d_data)
filtered_data = log_data[np.isfinite(log_data)]
plt.rcParams["figure.figsize"] = plt.rcParamsDefault["figure.figsize"]
plt.hist(filtered_data)
plt.xlabel("Magnitude order of _x2d image data")
plt.ylabel("count")
plt.title("_x2d Data Magnitude Order frequencies")
When showing the image, we normalize the color of each pixel to a specific range through vmin and vmax to make the features of the image clear. These values typically match the magnitude of the _x2d data according to the histogram above, but can be adjusted to bring out features at different levels.
# show the image
matplotlib.rcParams['figure.figsize'] = (10,10)
plt.imshow(x2d_data,origin='lower',vmin=0,vmax=1e-13,cmap="viridis")
For more color map and normalizing options, see: Choosing Colormaps in Matplotlib, matplotlib.pyplot.imshow.
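As an alternative to a linear vmin/vmax, a logarithmic normalization spreads the color scale over several orders of magnitude. The sketch below uses synthetic data standing in for the _x2d array, and the scale limits are illustrative, not taken from the real file:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# synthetic image standing in for the _x2d science array (hypothetical data)
rng = np.random.default_rng(42)
image = rng.exponential(scale=1e-14, size=(100, 100))

# LogNorm maps colors to the logarithm of the pixel values,
# which can bring out faint features that a linear scale hides
plt.imshow(image, origin='lower', norm=LogNorm(vmin=1e-16, vmax=1e-13), cmap="viridis")
plt.colorbar(label="Flux")
```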
# get the serious data quality flags
sdqFlags_fuv = fits.getheader(x2d_file, 1)["SDQFLAGS"]
# get the data quality flags of each pixel
with fits.open(x2d_file) as hdu_list:
    x2d_dq = hdu_list[3].data
# create a mask of bad pixels and set them to nan
def check_dq(dq):
    return bool(dq & sdqFlags_fuv)
mask = np.vectorize(check_dq)(x2d_dq)
x2d_mask = np.ma.array(x2d_data, mask=mask, fill_value=np.nan)
# plot the image
plt.imshow(x2d_mask,origin='lower',vmin=0,vmax=1e-13,cmap="viridis")
The MAMA detectors have a unique TIME-TAG mode in addition to ACCUM mode. TIME-TAG mode is used for high-time-resolution spectroscopy and imaging in the UV. Details of TIME-TAG operation for the MAMA detectors can be found at: MAMA TIME-TAG Mode.
In TIME-TAG mode, the position and detection time of every photon is recorded in an event list. First collect the _tag data:
# Search the target object by obs_id
target = Observations.query_criteria(obs_id='odgxt9010')
# get a list of files associated with that target
NUV_list = Observations.get_product_list(target)
# Download the fits files
Observations.download_products(NUV_list, extension='fits', download_dir=str(datadir))
# get info about the tag extension fits file
tag = Path("./data/mastDownload/HST/odgxt9010/odgxt9010_tag.fits")
fits.info(tag)
The _tag fits file has two binary table extensions: EVENTS and GTI.
# get header of the EVENTS extension
# print only the TIMETAG EVENTS TABLE COLUMNS (header cards 130-147)
fits.getheader(tag,1)[130:147]
Columns in the EVENTS extension:
# get header of the GTI extension
fits.getheader(tag,2)
Table.read(tag,2)
Columns in the GTI extension:
Now we make a time series plot of the total flux over all wavelengths to see how the flux changes over the time interval:
# read the events data as a pandas dataframe for easier manipulation
events = Table.read(tag, 1).to_pandas()
# get the good time interval from the GTI extension
start, stop = Table.read(tag, 2)["START"][0], Table.read(tag, 2)["STOP"][0]
# group the events by time with bins of 3 seconds
time = np.arange(int(start), int(stop), 3)
ind = np.digitize(events['TIME'], time)
total_flux = events.groupby(ind).count()["TIME"]
# plot the flux as a function of time
# notice here that the unit of flux is counts since we are counting the number of events in a time series
matplotlib.rcParams['figure.figsize'] = (20,7)
plt.plot(time,total_flux,marker=".",mfc="w")
plt.xlabel("Time [s]")
plt.ylabel("Total Flux [counts]")
As the plot shows, though the total flux fluctuates, it stays roughly constant over the good time interval.
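One way to check this quantitatively is to compare the relative scatter of the binned counts with the Poisson expectation 1/sqrt(mean). The sketch below uses synthetic Poisson counts, since the exact numbers depend on the downloaded dataset:

```python
import numpy as np

# synthetic stand-in for the binned TIME-TAG counts: Poisson draws around a constant rate
rng = np.random.default_rng(0)
counts = rng.poisson(lam=900, size=200)

# for a steady source, the relative scatter should be close to 1/sqrt(mean)
rel_scatter = counts.std() / counts.mean()
poisson_expectation = 1 / np.sqrt(counts.mean())
print(rel_scatter, poisson_expectation)
```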
TIME-TAG data can be converted into an ACCUM image using the inttag function in stistools. More information: Error and Data Quality Array
# define the output file path
accum = "./data/mastDownload/HST/odgxt9010/odgxt9010_accum.fits"
# convert TIME-TAG into ACCUM
# the first parameter is the path to the _tag fits file, the second is the path of the output file
stistools.inttag.inttag(tag, accum)
The output file has the same structure as a STIS image FITS file, which we can open and plot in the same way we explored above:
with fits.open(accum) as hdul:
    im = hdul[1].data
plt.imshow(im, origin='lower', vmin=0, vmax=6, cmap="viridis")
inttag can also produce multiple data sets: rcount specifies the number of imsets, and increment specifies the time interval of each imset in seconds.
stistools.inttag.inttag(tag,accum, rcount = 3, increment = 700)
fits.info(accum)
Compare the 3 accum images produced by inttag, using the same scale and colormap:
for i in range(1, 8, 3):
    with fits.open(accum) as hdul:
        im = hdul[i].data
    plt.subplot(1, 3, int(i/3)+1)
    plt.imshow(im, origin='lower', vmin=0, vmax=2, cmap="viridis")
    plt.title("extension {}".format(i))
plt.tight_layout()
The output file is a series of extensions, with each imset having a SCI, ERR, and DQ extension, as shown above.
The STIS echelle modes use diffraction and interference to separate a spectrum into different spectral orders (the header keyword is SPORDER), with each spectral order covering a different wavelength range. The echelle gratings were designed to maximize the spectral range covered in a single echellogram. For more information, see Long-Slit Echelle Spectroscopy.
We first collect an echelle dataset using astroquery:
# Search the target object by obs_id
target = Observations.query_criteria(obs_id='OCTX01030')
# get a list of files associated with that target
NUV_list = Observations.get_product_list(target)
# Download the fits files
Observations.download_products(NUV_list, extension='fits', download_dir=str(datadir))
Open the _flt (flat-fielded science) image in the same way we open other image files, and show the image:
flt_file="./data/mastDownload/HST/octx01030/octx01030_flt.fits"
with fits.open(flt_file) as hdu:
    image = hdu[1].data
plt.imshow(image, origin='lower', vmin=0, vmax=1, cmap="viridis")
As shown in the image above, there are multiple horizontal bands corresponding to different spectral orders. Each spectral order also has a distinct wavelength range, which we will explore in the next section.
We first read in the _x1d data as an astropy table. Notice that when we read in the FUV _x1d data in Section 2.2, the table had a single row with SPORDER = 1. For echelle-mode data, however, the data are separated into multiple rows, with each row holding a distinct order.
echelles_file="./data/mastDownload/HST/octx01030/octx01030_x1d.fits"
echelles_data = Table.read(echelles_file,1)
echelles_data
Now we can plot the spectrum of all spectral orders in one plot, with each spectral order having a distinct color:
# plot the spectrum; the color of each SPORDER is specified through a matplotlib built-in colormap
matplotlib.rcParams['figure.figsize'] = (20, 7)
for i in range(len(echelles_data)):
    plt.plot(echelles_data[i]['WAVELENGTH'], echelles_data[i]['FLUX'],
             color=cm.get_cmap('prism')(i/len(echelles_data)),
             label=echelles_data[i]['SPORDER'], alpha=0.7)
plt.xlabel('Wavelength [' + chr(197) +']')
plt.ylabel("Flux [ergs/s/cm$^2$/" + chr(197) +"]")
plt.legend(loc='best')
plt.title("STIS Echelle Mode Spectrum")
As the spectrum illustrates, each spectral order covers a specific wavelength range. Notice that the wavelength ranges can overlap with each other.
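The overlap between adjacent orders can be checked directly from the WAVELENGTH column of each row. This sketch uses hypothetical wavelength grids rather than the downloaded data:

```python
import numpy as np

# hypothetical wavelength grids of two adjacent echelle orders, in Angstroms
order_m = np.linspace(2270.0, 2300.0, 1024)
order_m1 = np.linspace(2290.0, 2320.0, 1024)

# the overlap region is the intersection of the two wavelength ranges
lo, hi = max(order_m.min(), order_m1.min()), min(order_m.max(), order_m1.max())
if lo < hi:
    print("orders overlap between {:.1f} and {:.1f} Angstroms".format(lo, hi))
```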
Author: Keyi Ding
Updated On: 2022-07-18
This tutorial was generated in compliance with the STScI style guides, and cites the Jupyter guide in particular.
Citations
If you use astropy, matplotlib, astroquery, or numpy for published research, please cite the
authors. Follow these links for more information about citations: